Generalized Lanczos Algorithm for Variational Quantum Monte Carlo 
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We show that the standard Lanczos algorithm can be efficiently implemented statistically and 
self consistently improved, using the stochastic reconfiguration method, which has been recently 
introduced to stabilize the Monte Carlo sign problem instability. With this scheme a few Lanczos 
steps over a given variational wavefunction are possible even for large size as a particular case of 
a more general and more accurate technique that allows to obtain lower variational energies. This 
method has been tested extensively for a strongly correlated model like the t-J model. With the 
standard Lanczos technique it is possible to compute any kind of correlation functions, with no 
particular computational effort. By using that the variance < H^ > — < H >^ is zero for an exact 
eigenstate, we show that the approach to the exact solution with few Lanczos iterations is indeed 
possible even for ~ 100 electrons for reasonably good initial wavefunctions. 

The variational stochastic reconfiguration technique presented here allows in general a many- 
parameter energy optimization of any computable many-body wavefunction, including for instance 
generic long range Jastrow factors and arbitrary site dependent orbital determinants. This scheme 
improves further the accuracy of the calculation, especially for long distance correlation functions. 
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I. INTRODUCTION 

The GFMCSR was introduced recentlyH as a method 
to stabilize the sign problem instability in the Monte 
Carlo simulation, for -so called- projection techniques, 
aiming to determine the ground state (GS) wavefunction 
Tpo of a given Hamiltonian H. 

In the statistical approach the electron GS wavefunc- 
tion is sampled over a set of configurations {|a; >}, de- 
noting spins and electron positions , and belonging to 
a complete basis with large or even infinite dimension. 
The ground state component '0o of a given trial state 
ipG- henceforth assumed to be non zero on each configu- 
ration X- is filtered out by applying to it some projection 
operator G" for large number n of power iterations. The 
matrix G maybe given by G = e^^ for short imagi- 
nary time At, as is the case for the conventional diffusion 
Monte Carlo for continuous models, or the one that will 
be considered in the following sections: 



G = (A/ - i?)* 



(1) 



for lattice models, where a suitably large constant shift 
A allows convergence to the ground state (A = is used 
for the t-J model considered later on). The integer kp, 
determining the number of powers of the Hamiltonian in 
the Green function, maybe in principle larger than oneH, 
but in the following we avoid this complication and we 



consider kp fixed at its minimum value: 



Kp — i. 



In order to perform stable simulations with large sig- 
nal to noise ratio even for large n the many-body prop- 
agation ipn -^ G'^IV'G > needs to be stabilized at each 
iteration, using an approximation that can be efficiently 
implemented statistically Iw means of the stochastic re- 
configuration (SR) scheme.D 

The essential step in the SR is to replace the many- 
body state ipn+iix) — Gipm obtained by applying to ipn 



the exact Green function G, ' with the approximate state 
"^'n+ii^)-, determined by the following conditions. A given 
set of p-f 1 operators { O'^ } , not restricted to be Hermitian 
operator, satisfy the following equalities: 

< Vg|0^|^;+i >=< ^G|Ol^n+i > for fc = 0, • • -p 

(2) 

where O" for fc = being the identity here for simpler 
and more compact notations, V'n-i-i = fxip'^{x), ^^ be- 
ing a reference state known exactly (e.g. the standard 
variational approach) or statistically (e.g. the fixed node 
approximate state) on each configuration x and 



Y^auOl 



fc=0 



where O,, 



< '4:g\0^\x > 



(3) 
(4) 



and the constants a^ are determined by the conditions 



k 



\l < ^gIo'I^p^ 



n+l 



> 



where the covariance matrix is given by 



(5) 



(6) 



In this way the SR can be considered a projection Psr 
of the exactly propagated wavefunction ^pn+i — Gipn 
onto a subspace spanned by the states \tp'f >, defined 
by < a;|'0f >= 0^4''^{x), for k — 0, •••p. Notice also 
that only when tpf — ipc, V'f = (O*^"!^! 



'ItpG > , namely tpj 



is obtained by applying the operator {O'')^ to the state 

-00 ■ 

Solving the linear system determined by the SR con- 
ditions (y) we obtain a close expression for the linear 
operator Psbj which explicitly depends on the state ipc 
and the reference state ip^: 



P' 



SR 



t,3 



10} X 0g|O^' 



(7) 



The operator Psr is not a true projection operator, 
Though it satisfies the simple requirement (i) Pgj^ — 
Psr, (ii) P^r = Psr is not generally satisfied. The way 
to implement statistically this "pseudo" -projection oper- 
ator in, the large number of walker limit, was discussed 
in ref.cl. 

After each reconfiguration the state ^n is replaced by 
tp'n = Psr tpm but also the referenca state ip^ is changed. 
In fact in the original formulatioiU'El also the reference 
state explicitly depends on n and is updated after each 
SR: 



iplix) -^ sgn {'il;G{x))\'il;'^{x)\. 



(8) 



The reason of the choice (g) is to optimize the reference 
state and obtain more accurate results, as explained in 
the forthcoming paragraphs. 

With the restriction of a stable simulation the signs 
of the reference wavefunction have to be fixed, otherwise 
the average sign will drop exponentially to zero in the 
simulation. Therefore in Eq. (0) ip^ has been restricted to 
have the same signs (or the same nodes) of the guiding 
wavefunction tpQ. On the other hand the amplitude of 
the reference wavefunction 0-' can be considerably im- 
proved from our best variational guess ipc, since during 
the exact dynamic the state ipn gets closer to the ground 
state wavefunction. The optimal choice for the ampli- 
tudes has naturally led to the definition (ph. 

This technique, with the choice (^, has been shown 
to be remarkably accurate for frustrated spin or bo- 
son systems, allowing in many test-cases to obtain es- 
sentially exact results within statistical errors JjQ How- 
ever for fermion problems the situation is much different. 
Though this technique allows a significant improizement 
in the energy and correlation function calculationsOo; the 
bias remains still sizable and difficult to eliminate com- 
pletely by increasing the number of correlation functions 
used in the SR technique. Due to the antisymmetry of 
the fermion many body wavefunction it appears that the 
nodes in this case play a much more important role. 

It is instructive to consider the case of continuous 
models. In this case, for fermion systems, nodes have 
to appear in the many-body wavefunction just by anti- 
symmetry considerations. On the other hand symmetry 
alone does not restrict the nodal surface (the locus where 
the wavefunction 4'o{x) vanishes), implying that correla- 
tion effects can significantly change the nodal surface. In 
this case it maybe useless or irrelevant to improve the 



amplitudes without changing the nodes in the reference 
wavefunction (^). 

For fermion systems, due to the above difficulty to 
determine a nodal surface which is weakly modified by 
correlation effects, it appears at the moment difficult to 
avoid a sizable "nodal" error in energies and correlation 
functions. Therefore in this case it is extremely impor- 
tant that any approximation is at least controlled by the 
variational principle. In this. way an approach different 
from the one proposed beforeEl can be used. The first step 
to obtain a rigorous variational method is to consider the 
reference wavefunction ^p^ fixed to the variational or the 
fixed node state.ffl 



II. FIXED REFERENCE DYNAMIC 

In the GFMC propagation a large number M of walk- 
ers is used, the f^ walker is characterized by two weights 
Wj,Wj, acting on a configuration state Xj, e.g. a state 
with definite electron positions and spins. The reference 
weights w!j sample statistically the reference state "0^ (x) 
whereas the weights Wj refers to the state ipnix) propa- 
gated by the exact Green function ipn = Gipn^i. More 
precisely, taking into account the importance sampling 
transformationnl: 



<<wj4.x, » =0^(x)0G(a;) 

<< WjSa:,x, >> = 'lpn{x)lpG{x) 



(9) 
(10) 
(11) 



where the brackets <<>> indicate both the statistical 
average and the one over the number of walkers, at a 
given Markov iteration n. The two wavefunctions 0^ and 
0„ are propagated, using the statistical approach. For 
the first state a reference Green function G^., ^ with all 
positive matrix elements is used, whereas the latter one 
is propagated by means of the exact Green function (Q) 
which is related to the reference one by a simple relation 
Gx',x = Sx'^xG^., ^ with finite and known matrix elements 

'^x' ,x- 

1pn+lix')4>Gix') ='^Gx',x-ipn{x)'ipG{x) (12) 

X 

0^+i(a:')0G(x') = Y. G{,.,0^(a:)0G(x) (13) 

X 

The reference Green function G^, ^ = Px',xbx is writ- 
ten in terms of a stochastic matrix Px',x {Px'.x > and 
X^a;' Px',x = 1) times an x dependent normalization factor 
bx = X^k' ^x' xi ^'^ that a statistical implementation of 
the iteration dl2) is possible. Namely given the configura- 
tion Xj of the li walker a new configuration x' is selected 
statistically with probability p^,' .j. . (notice i^Px'.,x = 1 

x' 

by definition): 



x^ 



„ — a;' with probability Px',xj- 
Then the reference weight: 



„/ 



.,/ 



(14) 



(15) 



is scaled by the normalization factor bx , whereas the 
weight related to the exact Green function G, 



Wi 



(16) 



is further multiplied by the Sx' x ■ rnatrix element. 

Accordingto the above Markov iteration , defined 
by Eqs.(|lJ-^ all the walkers propagate independently 
each others, The reference weights w-^ remain positive, 
whereas the ones Wj^ related to the exact propagation, 
accumulate many sign changes , leading, for large n, to 
an exponential increase of the signal-to noise ratio. This 
is in fact determined by the corresponding exponential 



drop of the average walker sign < s >= 



(the in- 



famous " sign problem" ) . The stochastic reconfiguration 
allows to alleviate this disease, by implementing statisti- 
cally the operator Psr described in the previous section. 
In practice the weights Wj are replaced by approximate 
weights Pj, but with much larger average sign < s >. 
The weights pj sample statistically V'n(a;)V'G(a^) and are 
determined by solving the corresponding linear system 
(0). This is done at a given Markov iteration by averag- 
ing Eq.(0) only over the walker samples. This means that 
there exists a bias due to the statistical uncertainty of the 
quantum averages in, Eq. (0) . This bias however can be 
controlled efhcientlya because for large number of walk- 
ers it vanishes as 1/M. In this limit Vx in Eq.(H) depends 
only on the configuration x as the statistical fluctuations 
of the constants ak can be neglected for M >> p. 

Another method to control much more efficiently the 
statistical fluctuations of the constants {ak} , without us- 
ing a too large number of walkers, is to perform the SR 
scheme only after applying, at each iteration n a large 
number if, of statistical steps defined in Eqs.dl4|lfl) In 
this way the statistical averages in Eq. (||) have only small 
fluctuations c>c -^= that can be neglected in the limit of 
large bin length Lb, even when a small number of walkers 
is used. In each bin we are considering only the iteration 
n ^ n -|- 1 of the power method when the parameters 
ak of the wavefunction PsB.ipn are known and computed 

in the previous iteration. Thus "^ w^ f evolve statisti- 
cally according to G^ (Eqs.nj,n5[), and new configura- 
tions {xj} are generated for the statistical sampling of 



=<< wj4 



>>, whereas, inside the bin, 

_ / 



the weights {wj} are always initialized to 
i.e. with the same configurations and a simple scaling of 
the weights it is possible to sample the propagated state 
PsRipn in all the Lh statistical steps. Then at each step 
inside the bin the exact Green function G is applied sta- 
tistically (Eqjl6[) in order to sample ipn+i — GPsRij^m 



which is required to calculate the RHS of the SR condi- 
tions (g) . At the end of the bin new ak can be computed 
by solving the linear system (g). It is understood that 
even inside the bin a branching scheme at fixed number 
of walkeraZI and with a fixed number of correcting factors 
can be used to improve importance sampling. Whenever 
the length of the bin is so large that the statistical fluc- 
tuations can be neglected, the algorithm is deterministic 
inside the statistical uncertainties, and becom^ much 
more efficient compared to the original schemeElD, where 
the SR conditions where applied at each step (L^ = 1). 
We have found instead that the most efficient scheme is 
to change these constants ak only when the SR condi- 
tions (0) are not satisfied within statistical errors (e.g. 
they are off by more than three error bars), by increas- 
ing systematically the bin length at each iteration. This 
scheme allows also to eliminate the bias due to the finite 
number of walkers M , which was rather sizable in the 
original formulationB. 

After the SR, say at the iteration no, in order to con- 
tinue the power method iteration for n > no, the new 
approximate state ip'^^ = Psr V'rao replaces ipno but the 
reference state -0^ is still arbitrary. Inajtead of changing 
the reference weights with the choiceS'B Wj — \pj\, (im- 
plying EqS in the large number of walker limits) it is 
instead possible to remain with the same reference state 
ipna^ without changing it in the statistical sense. This is 
obtained by the following simple scheme. After the SR 
new configurations x'l — Xj/^i) are selected among the old 

ones Xn according to the probability 11,- = ■r- l'^, , . This 

scheme naturally deffiies the random index table j{i)W, 
used to improve importance sampling- as in the branch- 
ing scheme for the standard diffusion Monte Carloliil-. 
and allows to continue the simulation more efficiently 
with equally weighted walkers \w[\ ~ Const.. In fact in 
order to sample statistically the states tp'^ and 4''^ with 

corresponding new weights w'l and wi : 

i>'^{x)ijGix) = << PiSx,x, >> = << w'^Sx^x'^ » 

?/;,{(x)V'G(a;) = « w{6x,x, »=« wf 6x^x[ » (17) 

it is enough to use the so called reweighting method, 
which makes the above equations exact in the statistical 
sense: 



w', = w Sgnpj(,) 

w{ =u}U.J(,)/|pj(,)| 



w, 



(18) 
(19) 



where w — jj'^j \Pj\ is a constant common to all the 
walker weights and Sgna = ±1 represents the sign of the 
number a. The correcting factor w is taken into account 
only for a finite number L of past iterations, starting e.g. 
from n — i -t- 1 , otherwise the weights of the walkers may 
increase or decrease exponentially leading to a divergent 
variance. This introduces a systematic bias that vanishes 



however exponentially in L and decreases as 1/Af .□ In 
practice for large number of walkers it is enough to con- 
sider only few (or even none) "correcting" factors in the 
statistical averages-as common practice in Green Func- 
tion Monte Carlo.tll From the reweighting method it is 
also clear that the choice of the probability function lij 
is not restricted to be proportional to pj. In particular 
we have found more convenient to use the weights corre- 
sponding to GV^n determined after applying Eq.(p^, so 
that the choice 11, = . T' , further improves importance 

sampling, with a minor change in EqsdTq,n3). 
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„/' 



with w = ^^ — • 

Using the previous scheme the reference state equili- 
brates necessarily to the largest right eigenstate of the 
reference matrix G^ . At equilibrium the state ipSR has 
therefore a very well compact and clear definition. It 
represents the maximum right eigenstate of the matrix 



PsR (A/ - H) PsR, 



(20) 



with given ip' in the definition of Psr (0) ■ 

The method is therefore rigorously variational provided 
the pseudo projector Psn-is a true projector operator, 
namely for tp-f = ^q when Pg^ = PsR^ a-s implied by 
Eq.(0). In this case in fact by standard linear algebra, 
the maximum right eigenstate of the operator ( pO[ ) is the 
best variational state of H belonging to the subspace de- 
fined by the projector PsRi (see Appendix H). 



A. Variational energy when Psr = -Pg^ 

In the original formulation of the SR the reference 
Green function was defined withj-a slight generalization 
of the fixed node Green functional. The Green function 
proposed by Hellberg and Manusakist3 is instead more 
appropriate in this context and much more convenient 
from the practical point of view of reducing statistical 
fluctuations: 



G 



■f 



1 



|A4',x-V'G(a;')-ff=.',x/^G(a;)| (21) 



where 



Yl l^^x'.x - iI'g{x')Hx>^x/'iJjg{x)\, \a\ 



ing the absolute value of the number a. It is simple to 
show that, by applying the power method with the above 
Green function, convergence is reached when the maxi- 
mum right eigenvector ijjg{xY is filtered out: 



Thus the above Green function can be used to generate 
statistically configurations ( see Eqs. |l4| , [l5| ) distributed 
according to ipcixY with a stochastic matrix Px'^x = 

^x' ,x^x' I Zx- 

The advantage of using the reference Green function 
( pi| ) is evident when we consider its very simple relation 
with the exact Green function, namely: 



,x — ^x' ,xl^x' x — ^-^x 



/ _ 



(23) 



where the sign ± is given by the sign of the Green func- 
tion matrix element K5x' ,x — ^g{x')Hxi^x/''Pg{x)^ and de- 
pends of course on x and x' . Using the fixed reference al- 
gorithm in Eq. ( pi)| ) '4)^ — ^g and the operator Psr repre- 
sents in this special case a true projector one Pg^ = Psr- 
Thus in this case the method is rigorously variational as 
pointed out in the last part of the previous section. 

We notice also an important property of this method. 
If in the SR conditions (0) only operators defined by pow- 
ers of the hamiltonian O^ — H^ are used, the projector 
Psr acts on the same Krilov basis (spanned by H^\'ipG > 
,fc = 0, . . . ,p) of the well known Lanczos algorithm. Thus 
{PsrG Psr)^\iPg > filters out the lowest energy varia- 
tional state in this Krilov basis, i.e. by definition the 
state obtained by applying p Lanczos iterations to ipG- 
We recover in particular a known property of the Lanczos 
algorithm, valid also for the SR method: the method is 
exact if p equals the dimension of the Hilbert space. 

Due to the equivalence of the Lanczos algorithm with 
the SR technique it is clear why, with the latter tech- 
nique, it is possible to obtain rather Epnd|-approximate 
ground state with p reasonably small.ll3'0i3 In fact the 
convergencG-Qf the Lanczos algorithm is at least expo- 
nential in pJU 

We have therefore derived that the Lanczos algorithm 
can be implemented statistically using the SR method. 
This allows to perform easily two Lanczos iterations on 
a given variational wavefunction for fairly large size sys- 
tems. Furthermore the SR method allows to put sev- 
eral correlation functions in the (0). Since the method 
is strictly variational, the variational energy has neces- 
sarily to decrease by increasing the mentioned number of 
correlation functions. 



B. Improving the variational energy 



FollowingEj, by applying a finite number of exact 
Green function iterations to the wavefunction iJjsRi the 
corresponding quantum average. 



^SR 






(24) 






Gi,,i,G{xf=i'G{xy 



(22) 



remains obviously variational for any k. 

Taking into account 2fc statistical factors Sx',x, the 
above quantum averages can be statistically evaluated 



with the same Markov chain for which Esr (k = 0) is 
computed. 

The sign problem can be controUed for not too large k 
and systematically improved variational energies can be 
obtained compared to the fc = result. However experi- 
ence has shown that it is very difhcult to have significant 
improvement over the fc = result for large system size. 



bound the ground state energy. 

A compromise between the two methods, is to intro- 
duce a parameter r that interpolates smoothly the two 
limits. This parameter enters in the reweighting relation 
(ly) in a very simple manner: 



w. 



f 



w, 



(26) 



Notice that, when we release the fixed reference dy- 
namic for r ^ and when for large n the constants 
{uk} converge to non zero values, the reference wave- 
function i/)' is not given by averaging the configura- 
tions with the weights w-^ since the relation ipsR = 
PsRfpnix) = Txip^ {x) is implied by the definition of the 
projector Psr in Eq.(0). In fact, by using Eq.(|2(^) and 
that << w[5x,x, >>= tI'sr{x)'iPg{x) = rxip^ {x)'iPg{x), it 
is simple to derive that: 

<< w{ Sx,xi » = \rx\''~^Sgnrx « w'^6a:,x, » 
= kajT'^Sgnr^, Vsfl(a;)V'G(a;) 
= |r.rV'^(x)^G(x) (27) 

For r ~ the method is rigorously variational, in the 
sense described before. It is reasonable to expect a sim- 
ilar behavior for r << 1 when Pgj^ — Psr, r = 1 being 
the optimal choice for the amplitudes of the reference 
wavefunction ip^ . 



For r ^ we assume that the infinite number of walk- 
ers or large bin Lb is taken so that the parameters ak 
in the definition of r^ (^ can be considered constants 
for large n. In this case, if we take into account the 
reweighting (Cq), the reference wavefunction ip^ is ob- 
tained as the right eigenvector iPr{x) —<< w{ 5x,xi » 
with maximum eigenvalue of the renormalized reference 
Green function G-^: 



III. VARIATIONAL ENERGY WHEN Psr / P]^ 

We have seen that by changing the definition of the ref- 
erence weight after the SR was applied (^9|) the method 
is rigorously variational. However , as we have explained 
in the introduction, a better choice is to continue after 
the SR with w-^ = \w'\. The rational of this choice being 
that the wavefunction tp' — Psr ip has much better am- 
plitudes than the variational wavefunction ipc- This al- 
lows to improve self-consistently the reference wavefunc- 
tion in order to be as close as possible to the true ground 
state. In this way however ip'^ 7^ V'G and the method 
is no longer variational in the sense that the SR state 
defined by the right eigenstate with maximum modulus 
I A — Esr\ eigenvalue of the matrix : 

Psr (A/ - H) Psr.Hsr >= (A - Esr)\4'sr > (25) 

is no longer the lowest energy one in the basis defined 

by \ipH > ,k = 0,1, ... ,p and Esr does not necessarily ^Psr'; 



Gi, 



'Gi, 



(28) 



, namely as tpfix) = |r^| '^%Pr{x)/iPg{x) 

Then the SR state iPsr{x) — r^ip^ {x) is simply given 
in terms of this right eigenvector: 



iPsr{x) == RxJPr{x) 
with Rx = Irxl^^'^/ipcix) Sgnr^ 



(29) 
(30) 



Thus even when r j^ the SR state can be uniquely 
determined. It is also clear that, since rx is not neces- 
sarily positive or negative, the nodes can be changed and 
improved with respect to the nodes of the initial guess 
i/jg, both for r = with the standard Lanczos algorithm 
and for r > 0. If the nodes of ipG are exact and the 
hamiltonian is not frustrated it is also possible to show 
that r = 1 and A — j 00 provide the exact result, with r^, 
positive definite. 

It is possible to compute correlation functions over 



As it is shown in App.(H) the answer is yes, and not 
only for correlation functions diagonal in the basis x , as 
in the standard forward walking techniqueQ, but also for 
all the ones with off diagonal elements contained in the 
non-zero hamiltonian matrix elements. In particular it is 
possible to compute : 

< iPsrIHI^sr >> Eo 

i.e. the expectation value over the state defined by the 
SR conditions. This estimate is obviously variational and 
can be further improved, by applying a finite number of 
exact Green function powers to the right and to the Left 
of the hamiltonian ,as in the power-Lanczos algorithmlij, 
with the difference that in this case Eq.(24) has to be 
evaluated with the " forward walking technique" , as de- 
scribed in the App.(p|). In Fig.(||) we plot the evolution of 
the expectation value of the energy over the state ^psR as 
a function of the number of iterations n, required by the 
forward walking to filter out from ipG its component over 
ipSR, leading to a true variational energy estimate, 
see that for r = 1, within the original SR techniquelH 
the energy expectation value can be much higher than 
the corresponding "mixed average" estimate {n — 0). 

This behavior can be understood in the following way 
, for r — I and for A ^ cx3 there is no way to improve 
the sign of the wavefunction over ipG because r^, — + 1 
(G and G^ tend to the identity up to a constant and so 
the correction rx to ip^ ~ i/j for r — 1 has to became 



unity up to 0{j^)), whereas for r = the Lanczos algo- 
rithm , which is A independent, certainly modifies and 
improves the nodes. For fermion systems therefore it ap- 
pears important to work with small r because the nodes 
of the wavefunction play a particular important role in 
determining a good variational energy. 

A different behavior is seen for correlation functions. 
For large J/t = 1 when a four particle bound state is 
likely to be formed, our Jastrow-Slater wavefunction is 
not appropriate enough, and the large distance behavior 
is not exactly reproduced (see Fig.nl) even when we apply 
to it a couple of Lanczos iterations, that, remarkably, 
provide a very accurate variational energy (see table). 

That the qualitative behavior of this correlation func- 
tion is different from the variational starting wavefunc- 
tion, can be understood only when the algorithm with 
r > or the FN (which is worse in energy) are used. Es- 
pecially successfuU is the original technique r = 1, which 
improves by a factor of two the important long range 
behavior of N{\R\), clearly displaying the features of a 
genuine bound state, by a decaying probability to have 
holes at larger and larger distances. For correlation func- 
tions diagonal in the chosen basis the nodes do not play 
any role and r = 1 or the FN itself, are likely to pro- 
vide much better correlation functions. However from 
the previous argument about the impossibility to correct 
the nodes for r — 1 and G,G^ ^ I we expect indeed that 
for large size the Green function G tends to the identity, 
either because A ex L, as required by the power method 
to converge, or because the gap to the first excited state 
decreases and a power iteration V'n-i-i = Gipn is less and 
less effective for changing the wavefunction. Thus r has 
to scale to zero for large size if we do not want to spoil 
too much the variational expectation value of the energy 
(see Fig.0). It is remarkable that the gain in variational 
energy is larger and larger when the size is increased. 
Thus the r > technique seems to overcome, at least 
partially, a serious limitation of the Lanczos algorithm, 
namely that in the thermodynamic limit the energy per 
site cannot be improved by a technique which is not size 
consistent. The gain in energy with r > can instead 
be size consistent, since, as shown in the appendix (^), 
r > corresponds to modify the reference Green function 
G^ -^ G-^ , similarly to what the Fixed node algorithm- 
which is size consistent for size consistent ■0q- does. 
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FIG. 1. Hole-hole correlations for the 26 site t — J clus- 
ter for various methods. "SR standard"p.uidicates the original 
GFMCSR implementation (r = 1 here)El'B, ,whereas "SR best 
energy" indicates the optimal variational SR- wavefunction 
obtained with r — 0.25. The "VMC " is the lowest energy 
Jastrow-Slater variational singlet wavefunction as discussed in 
Sec.(K]), FN-|-Lanczos step the fixed node over the one Lanc- 
zos step wavefunction. 
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FIG. 2. Energy per site as a function of the forward iter- 
ation n as described in the App.(y) for the 26 sites 4 holes 
case and the 98 site 14 hole case. The value of the variational 
parameter r is also indicated. 



IV. NUMERICAL IMPLEMENTATION 



In order to put efficiently a finite number p of hamil- 
tonian powers in the SR scheme it is by far more con- 
venient to use an importance sampling strategy (see 
e.g. App|C|), by using information of the previous p — 1 
Lanczos iterations. A guiding wavefunction correspond- 
ing to the previous (p — 1) approximation V'g(^) ~'^ 



x\ Y^ akH \ipG >= rp^i{x)il;G{x) can be usedo With 

fe=0 

this new guiding function the powers of the hamiltonian 
can be put in the SR conditions by computing corre- 
sponding mixed average estimators: 



o; 



<^g\x> 



(31) 



for k = 0, ■ ■ ■ p. 

The advantage of using the SR scheme is clear even 
when we restrict this method only to the evaluation of 
the first few Lanczos iterations. In order to perform p 
Lanczos iterations, it is enough to compute only p hamil- 
tonian powers on a given configuration x. In the conven- 
tional variational method it is always necessary to com- 
pute the expectation value of the hamiltonian amounting 
to 2p + 1 powers of the hamiltonian, leading to a much 
more demanding numerical effort. It is also important to 
emphasize that within this technique the parameters ak 
defining the SR state, are given at the end of the SR sim- 
ulation n -^ oo. Once the {ak} are determined it is then 
convenient to compute correlation functions with fixed 
constants {ak}, by performing statistical averages over a 
large bin Li, without applying the SR conditions (0) as 
discussed in Sec.(^). This method significantly improves 
the statistical fiuctuations of the quantum averages over 
the variational state PsR\tpG >■ 



V. ENERGY OPTIMIZATION FOR A MANY 
PARAMETER VARIATIONAL WAVEFUNCTION 

The most important advantage of the SR technique is 
that many variational parameters can be handled at a lit- 
tle expense of solving a linear system (||) of correspond- 
ing size. The few Lanczos step technique, as we have 
discussed before, is determined only by few coefficients 
due to the difficulty for large size to compute many pow- 
ers of the hamiltonian on a given configuration x, which 
is required for the evaluation of the corresponding mixed 

estimator O^ = <ti"!'^^^ ■ 

Clearly the method discussed in Sec.(||) is not fimited 
only to the hamiltonian momenta correlation functions, 
but remains variational for arbitrary ones. In particu- 
lar many kind of correlation functions can be thought to 
be a renormalization of the guiding wavefunction ^g, al- 
lowing a powerful multi-parameter energy optimization 
similar tollj, where the variance was instead minimized. 
In the present chapter we assume for simplicity that all 
the correlation functions are used for the purpose of opti- 
mizing the variational wavefunction iJ^g, and we restrict 
our very general analysis to variational wavefunctions of 
the Jastrow-Slater forpj, £e|r a strongly correlated system 
such as the t-J modelJl30 



A. The variational Avavefunction 

The Jastrow-Slater variational wavefunction can be 
generally written as: 



\^g) = J\S)- 



(32) 



where \S) is a determinant wavefunction, that can be 
obtained as an exact ground state of a generic one-body 



hamiltonian of the bilinear formll3 Hi -body — C c . 



ctct. 



h.c. whereas the Jastrow factor 



J = exp 



(E-( 



i,j)n,nj)expC^v''{i,j)a-aj 



(33) 



introduces arbitrary Gaussian correlations between local 
charges Ui — ^^ q a-'^i,>^ ^^^ local spins along the z-axis: 
cr| — cl-^.Ci^i — cl i-Ci^i Both operators are defined on 
each configuration {x} of the chosen Hilbert space. The 
reason of the exponential form in the Jastrow wavefunc- 
tion comes from size-consistency considerations, implying 
that for macroscopic and disconnected regions of space 
A and B the wavefunction factorizes ipA,B = '4'a^'>Pb, 
this factorization being fulfilled by the exponential form. 

Considering the t—J model the restriction of no doubly 
occupied sites can be thought of an infinitely negative 
v{i,i) = — oo charge correlation, whereas restriction at 
fixed number N of particles in a finite size L, can be 
thought of another singular Jastrow term exp{—oo(Ntot — 
N)'^) where Ntot — J2i "■« i^ ^^'^ total charge operator. 
The latter two singular terms in the Jastrow factor are 
simply a restriction on the Hilbert space that can be very 
easily implemented without numerical instabilities. 

We now consider how the symmetries of the finite size 
t-J model drastically restrict the still too large num- 
ber of variational parameters in the Jastrow-Slater wave- 
function. Translation invariance implies that the Jas- 
trow potential depends only on the distance v{i,j) — 
v{\Ri — Rj\), whereas w^ = for a singlet wavefunction. 

The singlet and translation symmetry imply also 
strong restrictions to the one-body hamiltonian. defining 
the Slater determinant. This hamiltonian can be gener- 
ally written: 



Hs = 


Ho + { 


AT + A) 




(34) 


At = 


>> 

(R,r) 


,(f)(4^, 


^R+T,l + ^R.+T,T^'R.i} 


(35) 


e^o 


= E^. 


■■ A,a^Ka 


is the free electron ti] 


ght bind- 



ing nearest-neighbor Hamiltonian, ek = —2t(coskx + 
cos ky) — /x, /z is the free-electron chemical potential and 
A''' creates all possible pairs at the various distances \t\ 
with definite rotation- refiection symmetry (e.g. (1^2 _y2 
implies A(1,0) = -A(0, 1)) 

For a generic Jastrow-Slater singlet state, satisfying 
all symmetries of the t-J model, a quite large number 



of variational parameters are therefore available corre- 
sponding to v{\t\) and D{t) for all distances |r|. Not 
all these parameters are independent, namely the sub- 
stitution w(|t|) — > i'(|t|) + const, does not change the 
wavefunction up to a constant, so that w(|t|) can be as- 
sumed to be zero at the maximum distance. Analogous 
dependence exists between the various parameters A(t), 
since after projecting it at fixed number of particles, the 
ground state of the Slater determinant hamiltonian ( [3^ ) 
can be written: 



\s>={J2f{f){ 

R,T 



C^ C^ 



-W4a))''^'|0> (36) 



where /(t) is the pairing wavefunction simply related to 
A(r) in Fourier transform: 



fik) 



A{k) 



ek + y/Al + e'i 



where e^ in the above expression is not limited to nearest 
neighbor hopping. 

Thus if we scale /(r) —^ const. f{T) the many body 
wavefunction (36) remains unchanged, implying that the 
number of independent parameters is equal to N shell ~~ 1, 
where Nsheii is the number of independent shells at 
|r| > consistent with the rotation-reflection symmetry 
of /. Notice that, once in the determinant part of the 
wavefunction Nsheii — 1 variational parameters are inde- 
pendently varied, it is useless to consider other terms, 
such as, for instance, the chemical potential /i or next- 
nearest neighbor hopping in Hq: they always provide a 
suitable renormalization of / that can be sampled by the 
first Nsheii — 1 parameters. Moreover by performing a 
particle- hole transformation in Eq. (p4) on the spin down 
c|| -^ (— l)*Ci.|, the ground state of the Hamiltonian 
( p4| ) is just a Slater-determinant with N — L particlesEy. 
This is the reason why this variational wavefunction rep- 
resents a generic Jastrow-Slater state, a standard varia- 
tional wavefunction used in QMC. Using the particle-hole 
transformation, it is also possible to control exactly the 
spurious finite system divergences related to the nodes of 
the d-wave order parameter. 



Thus for Ittfcl small, (1 + EfcafeO'=)|V'G >^ J ■>< 
exp(^j, afeOj_gQ£,y)|S' > which remains a Jas- 
trow Slater wavefunction of the same form JS" 
with S' — exp{J2k'^kO'^_^QjjY)\^ > Since one 
body operators are bilinear (e.g. c^c) in fermion 
second-quantization operators' S' remains a Slater 
determinant .Ej In the Jastrow-Slater case for the 
t — J model considered here the one-body opera- 
tors read: 



'^l-BODY 



E 

R.reShelli^k 



^\''')\'^R,T'^R+T,i + ^R+T,T^R,i) 



(38) 



where the sign S{t) = ±1 is determined by symme- 
try. Also the bar kinetic energy Hq is considered 
in this approach. According to the previous dis- 
cussion the chemical potential /z is fixed to the free 
electron one in Hq. 

• the second class of correlation functions are the 
ones that appear in the Jastrow factor. They 
are the diagonal operators O^ - density-density 
J2r riRnji+r or spin-spin J^r '^r'^r+t ~ i^ t^^ cho- 
sen basis X of configurations with fixed spins and 
electron positions. Again for small a they can be 
considered a renormalization of the Jastrow factor: 
J^ Jexp(S^afcO'=). 

The multi-parameter minimization method can be 
summarized as follows: 

• after each reconfiguration the factor r^ — J2k '^kO^ 
is computed with given a^, whose statistical fluc- 
tuations can be arbitrary reduced by increasing the 
number of walkers or the bin length Lf, as described 
in Sec.(O). In this case of non linear optimization 
the bin technique is particular important because it 
allows to avoid, for large enough bin length Lf,, un- 
physical fluctuations of the guiding wavefunction. 

After an exact Green function step a wavefunction 
better than ipc is obtained and is parametrized by 
the coefficient ak contained in the factor r^ : In fact 



B. Stochastic minimization 

Among the correlation functions important to define 
the variational wavefunction two classes are im por tant 
for guiding functions of the Jastrow-Slater form (|3^). 

• the first class of operators renormalize the Slater 
detecBiinant and have been identified by Filippi and 
Fahyliy, O'' are defined by means of one body op- 
erators 0'^_gQjjY by the following relation: 






< xlO^ltpc > 

< x\ljJG > 



< ^Oi-body\^ ^ 



<x\S> 



(37) 



PsR^'>Pn+l = PsRGPsRipn = T^j^ll^aix) 

has to be by— definition a variational state better 
than PsRil^n^i which in turn is better than ipc, 
since for instance we can assume ^"71=0 = '0G- 

• It is therefore convenient to change at each iteration 
n the guiding wavefunction: 



p 



IV'g >^ exp{\ akO^)\^G > 



fc=i 



(39) 



In the above equation we have introduced new 
scaled coefRcients ak — a.k/C simply related to the 



ones ak defined by the SR conditions (g). This 
is obtained by recasting r^ in a form that is more 
suitable for exponentiation: 



C 



k=l 



afc(0^--0* 



(40) 



where 



Qk 






-« w'.0^5x,Xj » 



and C = 1 + J2t=i^kO'^. The above exponen- 
tiation is justified, provided tt^^ — tt^tt- This 
is certainly fulfilled at equilibrium when for large 

"- "^n+i = PsriJm+i oc PsRipn oc ^G' "^G being 
the Jastrow-Slater wavefunction with lowest energy 
expectation value. In fact at equilibrium the SR 
conditions turn exactly to the Euler equations of 
minimum energy for ^q: 



<^^io^iv&> _ <rG\o''H\rG> 



< V'gIV^g > 



< ^*g\H\^*g > 



(41) 



as implied by (^) for V'n+i — 'i'h ^^^ i'n+i = 
Gip'j^ ex GipQ, and taking also into account that 
here for simplicity O" is the identity. 

This implies that V'g is a lowest energy wavefunc- 
tion of the Jastrow-Slater form. There maybe many 
local minima when the Euler's equations are iden- 
tically satisfied. In this case the SR represents a 
very useful tool for global minimization. In fact 
the bin length Lh or the number of walkers M rep- 
resent at each iteration n an effective inverse tem- 
perature that can be increased gradually following 
the well known "simulated annealing" statistical 
algorithm.LJ As it is shown in Fig.(|^) we apply this 
technique with very short bin length using a full 
translation invariant singlet wavefunction and us- 
ing x,y reflection symmetries without rotation sym- 
metry. This amounts to 24 independent param- 
eters for a 26 site cluster {Nsheii — 13). In the 
plot we show the evolution of the short range BCS 
parameters A(l, 0), A(0, 1) ,when at the beginning 
rpn=o — i^G was set with no Jastrow term v — Q 
and the s-wave symmetric determinant defined by 
A(1,0) = A(0, 1) = O.lt, A^ = for |r| > 1. The 
s-wave symmetric solution is locally stable, but for 
short enough bin there is a finite tunneling prob- 
ability to cross the barrier and stabilize the much 
lower energy solution with d-wave symmetry. 
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FIG. 3. Energy per site and evolution of the pairing ampli- 
tudes for 4 holes in the t-J model at J /t = 0.5 as a function 
of the stochastic power method iteration n. Here 200 walkers 
were used. Upper panel: triangles and circles denote nearest 
neighbor A(r) amplitudes in Eq.(|3J) along x and y directions 
respectively. Lower panel: the arrows indicate the power it- 
erations when the Monte Carlo bin length Lb has increased 
from 10 to 100 (left arrow) and from 100 to 500 (right arrow) 
Monte Carlo steps. 

• In order to continue with the new guiding wave- 
function (p9) without another long equilibration, 
walker weights Wj and w^- in Eqs.(p|,pO|) can be 



re weighted as follows: 



Wj ^Wj{il)'a{xj)/^G{xj)) 
y^f ^ wJ(V;^(xj)/V'G(a;j))^ 



(42) 



in order that the new weights acting on the same 
configurations x represent statistically Eq,(ph with 
ipf = ipQ = ijj'i^ and Eq.(|lO|) with the new guiding 
wavefunction rpQ. 

• For large n the one-body operators correspond- 
ing to the Slater-determinant may became lin- 
early dependent because D may approach an eigen- 
state of a one body hamiltonian ^hkO'^_gQjjYi 
J2^kO^^_gQjjY\S >~ E\S >, with suitable con- 
stants hk and E. Thus the covariance matrix Sk,k' 
quickly become singular, leading to a numerical in- 
stability which is difficult to control statistically. A 
stable method to oveccome this difficulty was found 
by Filippi and Fahytj, essentially by taking out one 
operator (the one body kinetic energy) from the 
ones used in the linear system (pi). Here we have 
found more convenient to solve the linear system 



y^ Sk.k'Olk' =< 1pG\0''G\iln >= Jk 



by diagonalizing first the symmetric matrix 



and taking out the lowest eigenvalue Ao > com- 
ponent of the positive definite matrix s. This 
is equivalent to select in the SR another set of 
p — I operators which are no longer singularly de- 
pendent. This is a perfecily legal operation for 
Aq ~ 0, since as shown ina, for the singular op- 
erator O* = J2kUkfiO'', such that 0*|V'g) = 0, 
the SR condition (0) is identically satisfied. Thus 
the resulting linear system is no longer affected by 
the above numerical instability: 



A,ar 



k 



Ukjfk 



with u'q — and 



ak 



= Y.u^'. 



j^j 



(44) 



(45) 



i>o 



Finally we obtain a much more stable determina- 
tion of ak, which does not affect the result at the 
equilibrium, where Aq — + and the correct Euler 
equations are satisfied. With this scheme also the 
optimization of the Jastrow parameters together 
with the Slater determinant ones is possible with- 
out too much effort. 

We have found that the generic situation for 
Jastrow-Slater wavefunctions is that the optimal 
determinant is actually the ground state of a one 
body hamiltonian Hi^body = '}2khkO'l_gQj^Y^ 
a particular linear combination of the chosen one 
body operators used in the SR conditions. |-This is 
in agreement with the Filippi-Fahy ansatzll3. Oc- 
casionally however, the optimal determinant turns 
out to be an excited state of a one body hamilto- 



the region J /t ^0.5 where pairs with d-wave symmetry 
repel each other and do not form many-particle bound- 
states as is the case for large J/t when phase separation 
occurs. 

For the 32 site cluster the spin-translation symmetry 
has to be explicitly broken in the variational wavefunc- 
tion by introducing a staggered magnetization Hq — > 
Hq — /S.AF T1ir{~^)^'^r along the z axis. The best wave- 
function compatible with reflection-rotation and trans- 
lation with spin interchange (T'*^!) can be conveniently 
parametrized by I^af wd a next neighbor hopping t'ln 
the Slater determinantcZI and also a spin Jastrow factor 
is allowed within this class of wavefunctions. Even in 
this case, as shown in the corresponding Fig(^), the hole- 
hole correlations are almost exactly reproduced by the 
strong attractive Jastrow term at long distance. This 
means that in this small doping regime it is important 
to have a broken symmetry ground state, which suppress 
the d-wave BCS order parameter. 
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FIG. 4. Hole-hole correlations (upper panels) and Jas- 
trow-Slater parameters (lower panels) for the optimal varia- 
tional singlet wavefunctions with pairing wavefunction / with 
d-wave symmetry in a 26 site cluster t-J model 



In Fig.(M) and Fig.(@) we show the full Jastrow-Slater 
optimization for the t-J model in the largest size cases 
where the exact solutiotuis known 4 holes in 26 sitesE3. 
and 2 holes in 32 sitesO, respectively, We display the 
hole-hole correlation functions N{R) =< (1 — no)(l — 
nn) > on the variational wavefunction with and with- 
out the Jastrow factor. We see that the improvement 
towards the exact solution is crucially dependent on the 
Jastrow density-density factor especially at long range 
distance. This behavior seem to be analogous to the one 
dimensional case where long range Jastrow-factors are 
enough to determine the anomalous long range behav- 
ior of cojxelation functions in one dimensional Luttinger 
liquidscjO. The remarkable accuracy of the Jastrow- 
Slater wavefunctions is clearly limited (see e.g. Fig.n^) to 
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FIG. 5. Same as in Fig.W for the 32 site cluster, with 
a broken symmetry Slater determinant wavefunction with 
Aaf = 0.2i Abcs = O.lf and t' /t = -0.15. 

The next step is to perform few Lanczos steps over 
these variational wavefunctions which have been shown 
to be very accurate but not an exact representation of 
the ground state many body wavefunction. In the singlet 
case with no broken symmetry the energy as a function 
of the variance of the energy per site: 



<^g\{H/L)^\^jg> (<MH/L\'4,g> 



< '0g|V'g > 



< '0g|V'G > 



(46) 



is indeed smoothly related to the exact ground state en- 
ergy. The reason is that for a good variational state the 
variance approaches zero as the energy becomes, exact, 
by increasing the number of Lanczos iterations .l3 
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FIG. 6. Energy per site as a function of the number of 
Lanczos steps p = (higher energy-variance) p = 1 (medium 
energy- variance) and p = 2 (lowest energy with non zero vari- 
ance) starting from the optimal Jastrow-Slater wavefunction 
for 4 holes in the 26 site clusters and several J/t. Only for 
J /t — the broken symmetry solution (with the spin Jastrow 
factor w^ 7^ 0) is better than the best singlet wavefunction 
(triangles). The arrows indicate the exact energies, whereas 
the zero variance energies are the extrapolated results with a 
quadratic fit (continuous lines). For J/t = 0.5 we show the 
corresponding variational p — energy and variance for the 
wavefunction without Jastrow term. 

Whenever on a finite system a broken symmetry vari- 
ational solution has much lower energy than the fully 
symmetric one, the Lanczos method is less effective to 
extrapolate to the exact value. This is shown for instance 
for the 26 sites at J/t — Fig.(^ or in the 32 site 2 hole 
case Fig.(|^). In the latter case we show also the energy 
as a function of the variance for the fully symmetric solu- 
tion. We see in this case, that the approach to the exact 
solution for a singlet wavefunction is rather difficult but 
indeed possible. 

On a finite system there is always a small energy gain 
to recover a state with definite spin. It is very difficult to 
obtain this residual energy with few Lanczos step itera- 
tions, since in order to average over the various directions 
of the order parameter many hamiltonian power itera- 
tions are required. However this residual energy should 
vanish in the thermodynamic limit if the symmetry is in- 
deed spontaneously broken. Therefore we conclude that 
the small discrepancy between exact results and extrap- 
olated ones in Figs.(p,R) is irrelevant within the above 
assumption, which is confirmed by simulations on much 
larger size, showing antiferromagnetic long range orderj^ 
small dopinglD and ferromagnetism for the J = caseo. 
In any case the variance extrapolation with few Lanczos 
steps provide always a much more accurate estimate of 
the exact ground state energy compared to the lowest 
variational estimate. 
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FIG. 7. Energy per site as a function of the number of 
Lanczos steps p = (higher energy with non zero variance) 
p = 1 (medium energy with non zero variance) and p = 2 (low- 
est energy with non zero variance) starting from the optimal 
Jastrow-Slater wavefunction for a 32-site t-J square lattice. 
The arrows indicate the exact energies, whereas the zero vari- 
ance energies are the extrapolated results with a quadratic 
fit (continuous lines). The broken symmetry solution is de- 
scribed in Fig.(B|) whereas the best singlet wavefunction is 
obtained by optimizing only the density-density Jastrow and 
the d-wave parameters as in Eq.(B3). 



VI. RESULTS AND DISCUSSION 

The variational approach is certainly limited and "bi- 
ased" by the "human" choice of the variational wave- 
function, "believed" to be the correct one for the physi- 
cal problem considered. In this paper we have described 
a variational approach that improves systematically any 
given variational wavefunction with a couple (and in prin- 
ciple more) Lanczos steps, with a reasonable computa- 
tional effort. This approach is certainly limited, espe- 
cially for large size, when few Lanczos iterations cannot 
remove the possible large bias of the initial variational 
guess. However for 2D fermionic system on a lattice, in 
the strong correlation regime, i.e. close to a Mott insu- 
lator state, it is very difficult to improve the best vari- 
ational wavefunction obtained with the Lanczos scheme 
(see table). This result is particularly meaningful if we 
consider that in principle the FN technique is size consis- 
tent (lowers the energy per site of the variational guess 
even in the thermodynamic limit) and the LS technique 
with fixed number of iterations p is not. Thus, close 
to a Mott insulator, it is very important to change the 
nodes of the wavefunction -that the Lanczos technique 
allows- rather than improving only the amplitudes- as 
in the FN technique. It is worth mentioning, however, 
that the FN energy reported in the table is only an up- 
per bound of the expectation value of the hamiltonian on 



the FN state. We do not know exactly how much this 
will lower the energy in favor of the FN technique, but 
we expect that this should be only a minor correction, 
especially for large sizes. 

As shown in the table the variational energy can be 
further improved by using the r > technique, which, 
within the formalism of the present paper, can be thought 
as a self-consistent improvement of the amplitudes and 
the nodes of the few Lanczos step variational wavefunc- 
tion. For r — 1 the original SR scheme is recovered, 
which as seen in Fig.(||), maybe iKrf; the optimal choice 
from the variational point of view.Elfl. As far as the vari- 
ational energy is concerned the self-consistent approach 
(r > 0) is not very effective (the improvement is between 
20% and 30% on small size systems) for small system 
sizes but appears to be more and more important as 
the size is increased. Also correlation functions maybe 
qualitatively improved (Fig.nl) by the self-consistent ap- 
proach, especially when a many-particle bound state ap- 
pears ("stripes" or phase separation in this model) which 
is not contained at the variational level. 

An important advantage of the standard variational 
approach (r — 0) is that the error in the ground state 
energy and correlation functions can be estimated using 
that the variance of the energy per site (Eq) in an exact 
calculation should be zero. The variance can be esti- 
mated systematically with high statistical accuracy for 
the first p Lanczos steps acting on the initial variational 
wavefunction ^q. We show that the approach to the ex- 
act result maybe smooth, even for large system size and 
number of electrons, even when ipc is not particularly 
close to the exact result. Obvious exceptions exist and 
are shown here in FigJ^ for correlation functions, whereas 
the energy seems always better behaved (see Fig.||). 

We have tested this simple scheme in the 2D- 
Heisenberg model where an exact solution of energies and 
correlation functions is easily available by using standard 
techniques. The 2D-IIeisenberg model has off-diagonal 
long-range order in the ground state, the order parame- 
ter 



^ Yl ^" 



L2 



Sb'{-1) 
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being finite in the thermodynamic limit L — > cxd. We 
start with the variational wavefunction in Eq.(p3) ob- 
tained by projecting out the doubly occupied states 
to a wavefcmction with d-wave n.n. BCS pairing 
correlationsE2l, but without any explicit antiferromag- 
netic order parameter. 

This wavefunction represents an accurate wavefunction 
for the quantum antiferromagnet as far as the energy 
is concerned, but certainly has not the right behavior 
at large distance, and maybe |eonsidered an RVB disor- 
dered variational wavefunctionllH After applying only two 
Lanczos iterations the 18 site size is almost exactly repro- 
duced by this simple wavefunction, showing that at short 
distance the quantum antiferromagnetic wavefunction is 
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almost indistinguishable from an RVB one.Bll As we in- 
crease the size, the variational energy calculation (see 
Fig.g) looses clearly accuracy, since the gap to the first 
excited state scales to zero and the Lanczos algorithm 
becomes correspondingly less effective. This loss of ac- 
curacy is however not dramatic (asymptotically it scales 
as -^f ), as shown by the similar quantitative agreement 
with the exact result obtained both with the 50 and the 
98 site lattice (Fig.i). 




FIG. 8. Energy per site of the finite-size Heisenberg model. 
Comparison of exact results (indicated by arrows) and the 
approximate p = 0, 1, 2 Lanczos step iterations over the pro- 
jected d-wave wavefunction. Continuous lines are quadratic 
fit of the data. 

Also remarkable is the correct trend of long range cor- 
relation functions as we increase the number of Lanczos 
iterations. As shown in the inset of Fig.(y) the linear 
extrapolation with the variance, which is valid for corre- 
lation functions averaged over the system volume ( see 
AppO), is capable to detect almost the right long range 
magnetic order in the Heisenberg model. This is remark- 
able if we consider that the starting wavefunction ipQ 
is disordered, as also shown in the same inset. We re- 
mark alsa,|that in this particular case the original SR 
algorithrmu for r = 1 and A ^ cxj is exact since the nodes 
of the variational wavefunction are the correct ones. 
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FIG. 9. Order parameter m — yj S{'k,'k)/L in the fi- 
nite-size Heisenberg model {Siji, vr) being the spin isotropic 
antiferromagnetic structure factor). Comparison of exact re- 
sults (indicated by arrows) and the approximate p = 0, 1, 2 
Lanczos step iterations over the projected d-wave wavefunc- 
tion. Continuous lines are quadratic fit of the data. Inset: 
finite-size scaling with the variational (BCS d-wave) wave- 
function and with the variance extrapolated one. 

As anticipated the estimate of the variational error, by 
using that the variance has to approach zero in an ex- 
act calculation, is really effective in this case, and shows 
that , even for large size it is possible to reach very good 
quantitative estimates of energies and correlation func- 
tions (seecj), even when the starting wavefunction is not 
particularly close to the exact one. In the Lanczos al- 
gorithm the variance becomes *zero only when the lowest 
energy state non orthogonal to the initial wavefunction is 
reached. However we expect that the variance may reach 
very small values close to a " quasi eigenstate" , implying 
the failure of the variance variational error estimate. In 
fact, in this case the energy optimization Lanczos tech- 
nique is trapped in a local minimum, with no possibility 
of tunneling to the true global minimum energy, (see 
fig®- 
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FIG. 10. Energy per site vs. variance for two different ini- 
tial wavefunctions (highest energy dots) after applying one 
(medium energy points) and two Lanczos steps (lowest ener- 
gies points) Lines are quadratic fit to the data. 

Clearly this is a well known problem in numerical opti- 
mization and the only possibility, when the exact solution 
is not known, is to check the various candidates for the 
ground state, and determine the one with the lowest en- 
ergy. From another point of view, this property of the 
Lanczos algorithm maybe useful to estimate physically 
observable quantities, such as the condensation energy 
for a metal to became superconductor, which is just the 
macroscopic difference in energy between two thermody- 
namically stable states. From Fig.(nG) an estimate of 
this condensation energy is 0.02i ^ lOOiir, injeasonable 
agreement with experiments on HTc cuprate£3, suggest- 
ing that the main features of d-wave superconductivity 
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can be understood with this simple model. 

At the moment, the approach we have presented here 
has to be limited to very few Lanczos iterations for large 
size with the given computer resources. Nevertheless it 
is certainly systematic and unbiased as far as the ap- 
proach to the ground state is concerned: the corrections 
to the initial guiding function depend only on the hamil- 
tonian H and no other biased approximation. Compared 
to the standard FN technique, it allows a systematic im- 
provement of the starting variational wavefunction ij^q, 
by correcting not only its amplitudes but also the nodes. 

The extension of this technique to continuous models is 
straightforward. For the reference dynamic (given by G^ 
in_Eq.El|) one can use the Langevin dynamic, as it is done 
ir£3, so that it is possible to determine the lowest energy 
state obtained by applying to ■0g few Lanczos steps with 
(r > 0) or without self-consistency(r = 0) or by using 
a many parameter variational wavefunction, with a non 
linear optimization as described in Sec.(M). 
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APPENDIX A: PROPERTIES OF THE 
OPERATOR PsR (AJ - H) Psr 

In this appendix we focus on some properties of the 
matrix Psr (A/ — H) Psr'- defined in Eg. (po|) . 

• the maximum eigenvalue A — Esr of the Hermi- 
tian matrix Psr (A/ — H) Psr is certainly smaller 
than the corresponding one A — £^0 of the ex- 
act Green function A/ — H . In fact be ipSR = 
Psr '4'SR the normalized eigenstate of Psr (A/ — 
H)PsR with eigenvalue Esr, then P^^ = Psr 
implies that Esr =< iI^srWsrH Psr\-4>sr >=< 
'4'Sr\H\iPsr >> ^0; as ipsR can be considered a 
variational state of the exact hamiltonian H with 
energy Esr. 

• Since ^jsr. is obtained by applying 

[Psr (A/ — H) Psr] " to a given trial wavefunc- 
tion. For n — > 00 such propagated wavefunction 
will converge therefore to the lowest energy state 
in the subspace projected by Psr- This implies 
clearly that Esr^ is lower or at most equal than the 



variational energy on the reference wavefunction: 
< V'gI^IV'G > simply because -^g belongs to this 
subspace. 

• Since f/'G belongs to the subspace projected by 
Psr, < MHPsR =< ^g\PsrH. Therefore 
the mixed average estimate-statistically much more 
convenient- 



Ema — 



<iI;g\H\'iJjsr> <i^G\PsRHPsR\iJsR> 



< -IpoHsR > 

Esr 



< -ipalPsRlipSR > 



, coincides with the variational bound Esr of the 
ground state energy, as ipSR. is an exact eigenstate 
of Psr H Psr with eigenvalue Esr- 



APPENDIX B: FORWARD WALKING 

In order to compute correlation functions over V'S-R it 
is necessary to use a slight generalization of the forward 
walking technique, generalized to non-symmetric matrix 
such as (pq). Moreover since in the meaningful SR limit 
of large number of walkers or bin length Lb —> 00 the 
parameters ak can be assumed constants in rx (]^, it is 
much more convenient to implement the forward walking 
technique without allowing any fluctuations of the ran- 
dom variables ak - This can be done easily by first evalu- 
ating the expectation value ratios ak =< ak > / < ag > 
k — 1, . . .p with the standard SR algorithm , i.e. allow- 
ing the ak fluctuations for each Markov iteration n. The 
second step is to perform a different simulation, usually 
much more efficient as far as the error bars are concerned, 
with Vx determined by non random constants ak'- 



Y^akOl 



(Bl) 



k=l 



If the Oik are determined accurately for M, Li, large the 
SR conditions (^ will be automatically verified within er- 
ror bars. The statistical or systematic error related to the 
determination of the constants ak is also not much impor- 
tant. In fact even assuming that with the first simulation 
the constants ak are determined with a non-negligible 
statistical error and an unavoidable systematic bias due 
to the finite number of walkers, the method that we will 
describe in the following will provide also in such a case 
a variational estimate of the energy with the chosen con- 
stants ak - The analogy of this method with the Lanczos 
method is evident also in this case. Even in the latter 
technique, a first run is usually implemented to determine 
the coefficients of the ground state dck in the Krilov basis 
spanned by the initial wavefunction ipc and the powers of 
the hamiltonian applied to it "00 = '0G+ X) oi.kH^\'(pG > 

k—l.p 

(with some more technical ingredient to work with an or- 
thonormal basis) . Then correlation functions over -00 are 
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computed by recovering the ground state wavefunction 
in this basis using the determined coefficients cxk- 

Let us now focus on the implementation of the for- 
ward walking technique within the SR scheme at fixed 
constants ctk- Since G-^ in Eq.(Eq) is not symmetric its 
left eigenvector < ip^l (x lim < tpcKG^)^ does not nec- 

n — *oo 

essarily coincide with the corresponding right eigenvector 
ipu- Fortunately the matrix G-^ can be easily written in 
terms of a symmetric matrix G^ 



G: 



'/ _ 



'^x'x/'^ 



with 



IV'G(x)||r,r/2 



Gx',x — , \^Sx',x — Hrc\x\ 



(B2) 

(B3) 
(B4) 



Therefore the right and the left eigenvectors of G^ are 
easily written in terms of the maximum eigenstate 0o of 
the symmetric matrix Gq, namely ')pii{x) — ax4>oix) and 
'>Pl{x) = (j)o(x)/a(x). Then using the definition of the 
SR state (|2^) it follows that also the left eigenvector of 
G-^ can be written in terms of "ipSR- 



with Lx = i]G{x)rx/zx 
and Rx = \rx\^^'' /ipaix) Sgnr^ 



(B5) 
(B6) 

(B7) 



After applying several times the Green function G^ 
the walkers w^ ,x determine the state iI}r{x). Then it is 
possible to evaluate expectation values of any operator 
O with given matrix elements Ox' ,x by applying the fol- 
lowing relation, which correspond to propagate n times 
forward 'ipfi{x): 



< '4'sr\0\')Psr > 



lim 



j:x'^'x'JGfrx".x'Ox',xMx) 



< ^SRimSR > "-- j:x'^',x'JGfy^„^^,Ix'^xi^R{x) 

(B8) 

where the matrix elements of O and the identity / are re- 
placed by the ones of the left-right transformed matrices 
O and / respectively. The explicit matrix elements of O 
and / in the RHS of the above equation are given by: 



^x'.x — ^x'^x'x^x 



h 



^x^x^x' 



x-^^x^x' ,x 



(B9) 
(BIO) 



This means that in the standard forward walking 
technique!], instead of using the importance sampled 
matrix ele ments obtained with L^ = ipcix) = l/Rx 
in Eq.(B9), the slightly more involved ones (B9 B1C| ) 
have to be considered. In fact by simple substitutions 
of these matrix elements into Eq.(B8), using also that 
E'x'iG^)x',',x' « Mx') = ^sr{x')/Lx. (H and that 
iiR{x) = ■ipsR{x)/Rx @, Eq.pi) is easily verified. 



The statistical algorithm used to evaluate the ratio in 
Eq.(B8) is verw similar to the standard "forward walk- 
ing" techniqueQ for diagonal operators. The few differ- 
ences are: 



• also the denominator in Eq.(B8) has to be "for- 
ward" propagated for n iterations, since in this 
case the diagonal elements of / are not trivially one 
(since L^ ^ Rx^)- The error bars have to be then 
calculated taking into account that the numerator 
and the denominator are very much correlated. 

• off diagonal operators can be computed without 
performing another simulation, provided the ma- 
trix elements of the operator O are contained in 
the non vanishing ones of the Green function G (or 
some power of G if the operator is evaluated statis- 
tically). In particular the expectation value of the 
hamiltonian and the even more accurate ones (El) 
can be computed altogether with a single Markov 
chain. 

• similarly the accuracy of diagonal and off diagonal 
operators can be further improved by computing 

<7/.gj,|G^-0G^|V;sfl> 
<4^sr\G^^\'<Psr> ■ 

In fact an important advantage of the SR tech- 
nique is that the reference Green function G;J,, ^ 
is non zero for all non-zero elements of the exact 
Green function G, (whereas in the FN technique 
the matrix elements with negative sign are sup- 
pressed). Thus the exact sampling of the Green 
function G can be done with the standard reweight- 
ing method, requiring only the finite multiplicative 
factors Sx' ,x = Gx'.x/G{^, ^, calculated for each it- 
eration n and each walker of the Markov chain. 
The same technique can be obviously generalized 
when the reference Green function G^ is the fixed 
node one-slightly generalized to have the possibil- 
ity to cross the nodeaS-, simply replacing G^ and 
Za; = 1 in the above expressions. However the sta- 
tistical accuracy for the determination of the con- 
stants {ak} is vary bad with the FN reference G^ , 
about an order of magnitude less efficient than the 
Eq.(pi|) one, without a significant improvement in 
variational energies. The reason of such bad be- 



havior (or the successful one for 21) is not clear at 
present. 



APPENDIX C: EFFICIENT CALCULATION OF 
THE SINGLE LANCZOS STEP WAVEFUNCTION 

In this Appendix we describe an efficient way to find 
the optimal LS wavefunction \')pa) — {^+o:H)\^p) , starting 
from a chosen variational guess \ip), i.e., to calculate the 
value of a for which the energy 
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(^g|(1 



aiJ)2|V;G) 



has a mininium. A standard method is to calculate sta- 
tistically the various powers of the hamiltonian 



K 



(V'gIV'g) 



(C2) 



using configurations x generated by the Metropolis al- 
gorithm according to the weight ipd^)'^- This method 
is however inefficient since much better importance 
sampling is obtained when configurations are instead 
generated according to the optimal Lanczos wavefunc- 
tion tpaiix) = (1 + tt*ei/'G(^)) V'G(a^), where e^(x) = 
(■iMaSr ^^ ^^^ local energy corresponding to a generic 
guiding wavefun ctio n ■0, and a* minimizes the above ex- 
pectation value (CI) for a ^ a* . This wavefunction ipa* 
maybe much better leading to much lower variances es- 
pecially for the higher momenta /i2 and h^. 

In this Appendix we describe an efficient way to find 
the optimal LS wavefunction IV'a*), starting from a cho- 
sen variational guess \tpa) with energy: 



E{a) 



hi + 2ah2 + a^h^ 
1 + 2ahi +a^h2 



(C3) 



easily written in terms of the energy momenta hn ■ 

In order to minimize ( |C3| ) , given an arbitrary value of 
a, it is convenient first to compute the energy expectation 
value hi with the standard statistical method and then, 
in place of the remaining hamiltonian higher momenta /12 
and ft.3, generate statistically configurations according to 
■!/'q(x)2 and compute: 



E{a) 



X- 



{i>a\lpa) 

(V'a|l/'a) 



E{a) is obtained by averaging over the chosen config- 
urations the local energy corresponding to ipa, namely 
< e^^ > whereas x is obtained by averaging over the 
same configurations < (1 -I- ae^g(x))^^ >. Given x it is 
straightforward to compute 

/i2 = [(x"^-2)(l + a/ii) + l]/«2 

, and therefore given hi and /12, the value of 
E(a) implicitly defines the highest momentum h^ = 

E{a){l+2ahi+a'^h2)-hi-2ah2 



Notice that the most diffi- 
cult energy momentum h^ is given by sampling an en- 
ergy expectation value, which is by far statistically more 
accurate compared to the direct determination of h^ . 

It is then possible to minimize analytically E(a), yield- 
ing: 



where the above sign ± is such to mi nimi ze E(a*). 

The analytic minimization of E{a) ( |C3| ), given the val- 
ues of X, /iiand E{a) itself, provides the exact value of 



a' in Eq. (C4) within the statistical uncertainties. They 
become smaller and smaller whenever a ^ a* . Typically 
two or at most three attempts are enough to reach an 
accurate determination of a* when the condition: 



1 



X 



l + a*E{a* 



(C5) 



is exactly fulfilled. This condition is true in general only 
for the eigenstates of the Hamiltonian, but remains valid 
for the single Lanczos step wavefunction. 



APPENDIX D: VARIANCE ESTIMATE OF THE 

ERROR ON "BULK" CORRELATION 

FUNCTIONS 

In this appendix we estimate the error on correlation 
functions assuming that the ground state |'0o > is ap- 
proximated with the wavefunction i/'p > distant e^ from 
I ■00 > Namely, with no loss of generality we write: 



IV'o >= iV'p > +ep|V'' > 



(Dl) 



with < V'plV'p >=< ip'W >= 1) V'') representing a 
normalized wavefunction orthogonal to the exact one 
< "00 IV'' >= 0. We restrict our analysis to thermo- 
dynamically averaged correlation functions O, the ones 
which can be written as a bulk average of local opera- 
tors Or: O = j- ^^ Or. This class of operators include 
for instance the average kinetic or potential energy or 
the spin-spin correlation function at a given distance ti 
Or = Sr, ■ Sr+t ■ If we use periodic boundary conditions 
the expectation value of Or on a state with given mo- 
mentum does not even depend on R and the bulk average 
does not represents an approximation 



< ^oIOrIv-o > < MO\i'o > 



< i/'o|'0o > 



< i>o\ll^Q > 



c. 



(D2) 



We show here that the expectation value of bulk av- 
eraged operators O on the approximate state ipp satisfy 
the following relation: 



< VpPIV'p >- C + Oiel ep/VI) 



(D3) 



-{h3 - hih2) ± y/ihs - hih^Y - 4{h2 - hDihihs - hi) 



thus i mply ing that for large enough size the expectation 
value (D3) approaches the exact correlation function C 
linearly with the variance. This allows to obtain a good 
accuracy with a good variational calculation, that is not 
easy to obtain if a term ~ e^ dominates. 

The validity of the above statement is very simple to 
^ow under very general grounds. In fact by definition: 



2(/ll/l3 



hi) 



< V'p|O|0p >^C + 2ep < 0'|O|0o > 



(C4) 



el < i,'\0\i,' > 
(D4) 
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The term proportional to e in the above equation can be 
easily bounded by use of the Schwartz inequality: 

I < 7^'IOlV'o > P = I < V/|0 - C7|Vo > P << MiO - Cf 

(D5) 

The final term in the latter inequality can be estimated 
under the general assumption that correlation functions 
C{t) =< {Or - C){OR+r - C) > decay sufficiently fast 
with distance |r|, as a consequence of the cluster prop- 
erty: 



<Vo|(0-C)2|^o>=(l + e') 



iE 



C{t). 



This concludes the proof of the statement of this ap- 
pendix, provided ^^ C{t) is finite for L -^ oo. 
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iV L J/t VMC VMC+LS VMC+FN VMC+2 LS VMC+LS +FN Best SR Best r Exact 



-0.6277(1) -0.6381(1) 

-0.7759(1) -0.7852(1) 

-1.1608(1) -1.1719(1) 

-0.4611(1) -0.46522(3) 

-0.6777(1) -0.6865(1) 



22 26 0.3 -0.6138(1) -0.6332(1) 

22 26 0.5 -0.7647(1) -0.7812(1) 

22 26 1.0 -1.1476(1) -1.1672(1) 

30 32 0.3 -0.4543(1) -0.4628(1) 

84 98 0.4 -0.6653(1) -0.6807(1) 

50 98 0.4 -0.9656(1) -0.9832(1) -0.98225(5) -0.9886(1) 

TABLE I. Energy per site in the t-J model for various vari- 
ational methods. VMC is the standard variational method, 
VMC-(-LS is obtained by applying to it a Ltmczos step, 
VMC-I-FN is the lattice fixed node approachl'Q, VMC-l-2 
LS indicate the two Lanczos step variational wavefunction, 
VMC-I-LS-I-FN, is the fixed node over the VMC-I-LS wave- 



-0.6371(1) 


-0.6387(1) 


0.375 


-0.64262 


-0.7841(1) 


-0.7855(1) 


0.25 


-0.78812 


-1.1706(1) 


-1.1724(1) 


0.25 


-1.17493 


0.46524(3) 


-0.4661(1) 


0.375 


-0.470175 


0.68530(5) 


-0.6879(2) 


0.1 


-0.692(1) 


0.98781(6) 


-0.9901(2) 


0.1 


-0.9920(5) 



^SR >, computed by 
by optimizing the 



function, and the "Best r" is < psisR\H[ 
"forward walking" as described in App.(E 
parameter r. The exact energy values for the largest size was 
estimated by the variance extrapolation. On the 98 sites,the 
FNLS computation takes 10 hour CPU time on a Pentium-II 
400MHz, whereas the VMC-I-2LS wavefunction takes about 
40 hours with M = 500 walkers for a statistical accuracy of 
10~4f on the energy per site, the "best SR" another factor 
8 more due to the forward walking. The computation of di- 
agonal correlation functions instead takes a similar amount 
of time for all the methods, thus it is safer to compute them 
with the best variational method. Error bars are indicated in 
brackets. 
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